Patterns of Scalable Bayesian Inference
Datasets are growing not just in size but in complexity, creating a demand
for rich models and quantification of uncertainty. Bayesian methods are an
excellent fit for this demand, but scaling Bayesian inference is a challenge.
In response to this challenge, there has been considerable recent work based on
varying assumptions about model structure, underlying computational resources,
and the importance of asymptotic correctness. As a result, there is a zoo of
ideas with few clear overarching principles.
In this paper, we seek to identify unifying principles, patterns, and
intuitions for scaling Bayesian inference. We review existing work on utilizing
modern computing resources with both MCMC and variational approximation
techniques. From this taxonomy of ideas, we characterize the general principles
that have proven successful for designing scalable inference procedures and
comment on the path forward.
Accelerating MCMC via Parallel Predictive Prefetching
We present a general framework for accelerating a large class of widely used
Markov chain Monte Carlo (MCMC) algorithms. Our approach exploits fast,
iterative approximations to the target density to speculatively evaluate many
potential future steps of the chain in parallel. In Bayesian inference
problems, the approach can accelerate sampling from the target distribution,
without compromising exactness, by exploiting subsets of data. It takes
advantage of
whatever parallel resources are available, but produces results exactly
equivalent to standard serial execution. In the initial burn-in phase of chain
evaluation, it achieves speedup over serial evaluation that is close to linear
in the number of available cores.
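The core mechanism can be illustrated with a toy sketch (not the authors' implementation; the 1-D Gaussian target and all names are illustrative assumptions). With a shared random stream, the density evaluations for both branches of each accept/reject decision can be computed in parallel before the decisions are resolved, and the resulting chain is exactly the serial Metropolis-Hastings chain:

```python
import math
import random
from concurrent.futures import ThreadPoolExecutor

def log_target(x):
    # Stand-in for an expensive log-density (1-D standard normal here).
    return -0.5 * x * x

def serial_mh(x0, n_pairs, seed=0):
    """Plain Metropolis-Hastings; runs 2*n_pairs steps."""
    rng = random.Random(seed)
    chain, x, lp = [x0], x0, log_target(x0)
    for _ in range(2 * n_pairs):
        e, u = rng.gauss(0.0, 1.0), rng.random()
        y = x + e                              # random-walk proposal
        lq = log_target(y)
        if math.log(u) < lq - lp:
            x, lp = y, lq                      # accept
        chain.append(x)
    return chain

def prefetch_mh(x0, n_pairs, seed=0):
    """Speculative variant: per pair of steps, draw the randomness up front,
    build the tree of the three states the chain could visit, and evaluate
    the target at all of them in parallel BEFORE resolving either
    accept/reject decision.  Output is exactly the serial chain."""
    rng = random.Random(seed)
    chain, x, lp = [x0], x0, log_target(x0)
    with ThreadPoolExecutor(max_workers=3) as pool:
        for _ in range(n_pairs):
            e1, u1 = rng.gauss(0.0, 1.0), rng.random()
            e2, u2 = rng.gauss(0.0, 1.0), rng.random()
            y1 = x + e1                        # step-1 proposal
            y2a, y2r = y1 + e2, x + e2         # step-2 proposal on each branch
            f1, f2a, f2r = [pool.submit(log_target, s) for s in (y1, y2a, y2r)]
            l1 = f1.result()
            if math.log(u1) < l1 - lp:         # resolve step 1
                x, lp, y2, l2 = y1, l1, y2a, f2a.result()
            else:
                y2, l2 = y2r, f2r.result()
            chain.append(x)
            if math.log(u2) < l2 - lp:         # resolve step 2
                x, lp = y2, l2
            chain.append(x)
    return chain
```

In the actual framework the speculation tree is deeper and is steered by a cheap approximation to the target density, so most prefetched evaluations end up on the path the chain actually takes.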
Accelerating Markov chain Monte Carlo via parallel predictive prefetching
We present a general framework for accelerating a large class of widely used Markov chain Monte Carlo (MCMC) algorithms. This dissertation demonstrates that MCMC inference can be accelerated in a model of parallel computation that uses speculation to predict and complete computational work ahead of when it is known to be useful. By exploiting fast, iterative approximations to the target density, we can speculatively evaluate many potential future steps of the chain in parallel. In Bayesian inference problems, this approach can accelerate sampling from the target distribution, without compromising exactness, by exploiting subsets of data. It takes advantage of whatever parallel resources are available, but produces results exactly equivalent to standard serial execution. In the initial burn-in phase of chain evaluation, it achieves speedup over serial evaluation that is close to linear in the number of available cores.Engineering and Applied Science
Excitability Constraints on Voltage-Gated Sodium Channels
We study how functional constraints bound and shape evolution through an analysis of mammalian voltage-gated sodium channels. The primary function of sodium channels is to allow the propagation of action potentials. Since Hodgkin and Huxley, mathematical models have suggested that sodium channel properties need to be tightly constrained for an action potential to propagate. There are nine mammalian genes encoding voltage-gated sodium channels, many of which are more than ≈90% identical by sequence. This sequence similarity presumably corresponds to similarity of function, consistent with the idea that these properties must be tightly constrained. However, the multiplicity of genes encoding sodium channels raises the question: why are there so many? We demonstrate that the simplest theoretical constraints bounding sodium channel diversity—the requirements of membrane excitability and the uniqueness of the resting potential—directly constrain sodium channel properties. We compare the predicted constraints with functional data on mammalian sodium channel properties collected from the literature, including 172 different sets of measurements from 40 publications, wild-type and mutant, under a variety of conditions. The data from all channel types, including mutants, obey the excitability constraint; on the other hand, channels expressed in muscle tend to obey the constraint of a unique resting potential, while channels expressed in neuronal tissue do not. The excitability properties alone distinguish the nine sodium channels into four different groups that are consistent with phylogenetic analysis. Our calculations suggest interpretations for the functional differences between these groups.
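As a concrete illustration of the kind of excitability requirement such models impose, here is a minimal forward-Euler Hodgkin-Huxley simulation. It uses the classic squid-axon parameters, not the mammalian channels studied in the paper, and all names are ours: with the standard sodium conductance the membrane fires an action potential, while with sodium channels removed the same stimulus only depolarizes it passively.

```python
import math

def simulate(g_na, i_stim=10.0, t_max=25.0, dt=0.01):
    """Forward-Euler Hodgkin-Huxley membrane; returns peak voltage (mV).
    g_na is the maximal sodium conductance in mS/cm^2."""
    g_k, g_l = 36.0, 0.3                       # mS/cm^2
    e_na, e_k, e_l = 50.0, -77.0, -54.387      # reversal potentials, mV
    c_m = 1.0                                  # membrane capacitance, uF/cm^2

    def vtrap(x, y):
        # Numerically safe x / (1 - exp(-x/y)); limit is y as x -> 0.
        return y if abs(x / y) < 1e-7 else x / (1.0 - math.exp(-x / y))

    def rates(v):
        am, bm = 0.1 * vtrap(v + 40, 10), 4.0 * math.exp(-(v + 65) / 18)
        ah, bh = 0.07 * math.exp(-(v + 65) / 20), 1.0 / (1 + math.exp(-(v + 35) / 10))
        an, bn = 0.01 * vtrap(v + 55, 10), 0.125 * math.exp(-(v + 65) / 80)
        return am, bm, ah, bh, an, bn

    v = -65.0                                  # start at rest
    am, bm, ah, bh, an, bn = rates(v)
    m, h, n = am / (am + bm), ah / (ah + bh), an / (an + bn)  # steady-state gates
    peak = v
    for _ in range(int(t_max / dt)):
        i_na = g_na * m**3 * h * (v - e_na)    # fast sodium current
        i_k = g_k * n**4 * (v - e_k)           # delayed-rectifier potassium
        i_l = g_l * (v - e_l)                  # leak
        v += dt * (i_stim - i_na - i_k - i_l) / c_m
        am, bm, ah, bh, an, bn = rates(v)
        m += dt * (am * (1 - m) - bm * m)
        h += dt * (ah * (1 - h) - bh * h)
        n += dt * (an * (1 - n) - bn * n)
        peak = max(peak, v)
    return peak
```

With `g_na = 120` the peak voltage overshoots 0 mV (a spike); with `g_na = 0` it stays tens of millivolts below threshold, which is the excitability constraint in its crudest form.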
A linear and regularized ODF estimation algorithm to recover multiple fibers in Q-Ball imaging
Due to the well-known limitations of diffusion tensor imaging (DTI), high angular resolution diffusion imaging is currently of great interest to characterize voxels containing multiple fiber crossings. In particular, Q-ball imaging (QBI) is now a popular reconstruction method to obtain the orientation distribution function (ODF) of these multiple fiber distributions. The latter captures all important angular contrast by expressing the probability that a water molecule will diffuse into any given solid angle. However, QBI and other high order spin displacement estimation methods involve non-trivial numerical computations and lack a straightforward regularization process. In this paper, we propose a simple linear and regularized analytic solution for the Q-ball reconstruction of the ODF. First, the signal is modeled with a physically meaningful high order spherical harmonic series by incorporating the Laplace-Beltrami operator in the solution. This leads to an elegant mathematical simplification of the Funk-Radon transform using the Funk-Hecke formula. In doing so, we obtain a fast and robust model-free ODF approximation. We validate the accuracy of the ODF estimation quantitatively using the multi-tensor synthetic model where the exact ODF can be computed. We also demonstrate that the estimated ODF can recover known multiple fiber regions in a biological phantom and in the human brain. Another important contribution of the paper is the development of ODF sharpening methods. We show that sharpening the measured ODF enhances each underlying fiber compartment and considerably improves the extraction of fibers. The proposed techniques are simple linear transformations of the ODF and can easily be computed using our spherical harmonics machinery.
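The linear recipe described above can be sketched in a few lines: fit an even-order real spherical harmonic series to the signal with a Laplace-Beltrami smoothing penalty, then apply the Funk-Radon transform coefficient-by-coefficient via the Funk-Hecke scaling 2π P_ℓ(0). This is a hedged sketch of the published linear QBI approach, not the authors' code; function names, the sampling scheme, and the regularization weight are illustrative assumptions.

```python
import math
import numpy as np
from scipy.special import lpmv

def real_sh(l, m, theta, phi):
    """Real spherical harmonic Y_{lm} (theta = polar angle, phi = azimuth)."""
    norm = math.sqrt((2 * l + 1) / (4 * math.pi)
                     * math.factorial(l - abs(m)) / math.factorial(l + abs(m)))
    p = lpmv(abs(m), l, np.cos(theta))         # associated Legendre P_l^{|m|}
    if m == 0:
        return norm * p
    if m > 0:
        return math.sqrt(2) * norm * p * np.cos(m * phi)
    return math.sqrt(2) * norm * p * np.sin(-m * phi)

def qball_odf_coeffs(signal, theta, phi, order=4, lam=0.006):
    """Regularized linear QBI: SH coefficients of the ODF from one voxel's
    HARDI signal sampled at directions (theta, phi)."""
    lm = [(l, m) for l in range(0, order + 1, 2) for m in range(-l, l + 1)]
    B = np.column_stack([real_sh(l, m, theta, phi) for l, m in lm])
    # Laplace-Beltrami penalty: Y_{lm} is an eigenfunction with eigenvalue -l(l+1).
    L = np.diag([float(l * (l + 1)) ** 2 for l, _ in lm])
    c = np.linalg.solve(B.T @ B + lam * L, B.T @ signal)   # smoothed SH fit

    def legendre0(l):
        # P_l(0) for even l: (-1)^{l/2} l! / (2^l ((l/2)!)^2)
        return (-1) ** (l // 2) * math.factorial(l) / (2 ** l * math.factorial(l // 2) ** 2)

    # Funk-Radon transform via the Funk-Hecke theorem: scale each coefficient.
    frt = np.array([2 * math.pi * legendre0(l) for l, _ in lm])
    return frt * c
```

A quick sanity check: an isotropic signal keeps only the l = 0 coefficient, and the resulting ODF is the constant 2π (the Funk-Radon transform of a constant is the circumference of a great circle).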
Apparent Diffusion Coefficients from High Angular Resolution Diffusion Images: Estimation and Applications
High angular resolution diffusion imaging (HARDI) has recently been of great interest in characterizing non-Gaussian diffusion processes. In the white matter of the brain, non-Gaussian diffusion occurs when fiber bundles cross, kiss or diverge within the same voxel. One important goal in current research is to obtain more accurate fits of the apparent diffusion processes in these multiple fiber regions, thus overcoming the limitations of classical diffusion tensor imaging (DTI). This paper presents an extensive study of high order models for apparent diffusion coefficient estimation and illustrates some of their applications. In particular, we first develop the appropriate mathematical tools to work on noisy HARDI data. Using a meaningful modified spherical harmonics basis to capture the physical constraints of the problem, we propose a new regularization algorithm to estimate a diffusivity profile that is smoother and closer to the true noise-free diffusivities. We define a smoothing term based on the Laplace-Beltrami operator for functions defined on the unit sphere. The properties of the spherical harmonics are then exploited to derive a closed form implementation of this term into the fitting procedure. We next derive the general linear transformation between the coefficients of a spherical harmonics series of order ℓ and the independent elements of the rank-ℓ high order diffusion tensor. An additional contribution of the paper is the careful study of state-of-the-art anisotropy measures for high order models computed from spherical harmonics or tensor coefficients. Their ability to characterize the underlying diffusion process is analyzed. We are able to reproduce published results and also able to recover voxels with isotropic, single fiber anisotropic and multiple fiber anisotropic diffusion. We test and validate the different approaches on apparent diffusion coefficients from synthetic data, from a biological phantom and from a human brain dataset.
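One widely used measure of the kind surveyed here, generalized fractional anisotropy (GFA, defined as std/rms of the function on the sphere), has a closed form directly in real spherical harmonic coefficients: since the l = 0 term carries the spherical mean, GFA = sqrt(1 - c_0² / Σ_j c_j²). A minimal sketch (the function name is ours):

```python
import math
import numpy as np

def gfa_from_sh(coeffs):
    """Generalized fractional anisotropy from real SH coefficients.
    coeffs[0] must be the l = 0 coefficient; the identity follows from
    Parseval's theorem on the orthonormal SH basis."""
    c = np.asarray(coeffs, dtype=float)
    power = float(np.dot(c, c))                # total spectral power
    if power == 0.0:
        return 0.0                             # zero function: define GFA = 0
    return math.sqrt(1.0 - c[0] ** 2 / power)
```

An isotropic profile (all power in c_0) gives GFA = 0, and the measure grows toward 1 as power shifts into the higher-order terms that encode angular contrast.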
StarFlow: A Script-Centric Data Analysis Environment
We introduce StarFlow, a script-centric environment for data analysis. StarFlow has four main features: (1) extraction of control and data-flow dependencies through a novel combination of static analysis, dynamic runtime analysis, and user annotations, (2) command-line tools for exploring and propagating changes through the resulting dependency network, (3) support for workflow abstractions enabling robust parallel executions of complex analysis pipelines, and (4) a seamless interface with the Python scripting language. We describe a range of real applications of StarFlow, including automatic parallelization of complex workflows in the cloud.
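A toy version of the static-analysis half of this idea can be written with Python's own `ast` module: map each top-level assignment to the names its right-hand side reads, then propagate a change through the resulting dependency network. This is an illustrative sketch, not StarFlow's actual analysis, which also combines dynamic runtime tracing and user annotations.

```python
import ast

def dataflow_deps(source):
    """Static pass over a script: map each top-level assigned name to the
    names its right-hand side reads (a crude data-flow dependency network)."""
    deps = {}
    for node in ast.parse(source).body:
        if isinstance(node, ast.Assign):
            reads = {n.id for n in ast.walk(node.value) if isinstance(n, ast.Name)}
            for target in node.targets:
                if isinstance(target, ast.Name):
                    deps[target.id] = reads
    return deps

def downstream(deps, changed):
    """Change propagation: every name that transitively reads `changed`
    and therefore needs to be recomputed."""
    dirty, frontier = set(), {changed}
    while frontier:
        frontier = {v for v, reads in deps.items()
                    if reads & frontier and v not in dirty}
        dirty |= frontier
    return dirty
```

For a script like `a = 1; b = a + 1; c = b * 2`, editing `a` marks `b` and `c` dirty, which is the behavior the command-line change-propagation tools build on.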
Flash Caching on the Storage Client
Flash memory has recently become popular as a caching medium. Most uses to date are on the storage server side. We investigate a different structure: flash as a cache on the client side of a networked storage environment. We use trace-driven simulation to explore the design space. We consider a wide range of configurations and policies to determine the potential benefit client-side caches might offer and how best to arrange them. Our results show that the flash cache writeback policy does not significantly affect performance. Write-through is sufficient; this greatly simplifies cache consistency handling. We also find that the chief benefit of the flash cache is its size, not its persistence. Cache persistence offers additional performance benefits at system restart at essentially no runtime cost. Finally, for some workloads a large flash cache allows using minuscule amounts of RAM for file caching (e.g., 256 KB), leaving more memory available for application use.
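The kind of trace-driven simulation used here can be reduced to a small model. The sketch below assumes simple LRU eviction and the write-through policy the results favor; all names are ours, not from the paper.

```python
from collections import OrderedDict

class WriteThroughFlashCache:
    """Tiny trace-driven model of a client-side flash cache with LRU
    eviction.  Writes go straight to the server (write-through), so the
    cache never holds dirty blocks and consistency handling stays simple."""

    def __init__(self, capacity_blocks):
        self.capacity = capacity_blocks
        self.blocks = OrderedDict()              # block id -> None, in LRU order
        self.hits = self.misses = self.server_writes = 0

    def access(self, block, is_write):
        if block in self.blocks:
            self.blocks.move_to_end(block)       # refresh LRU position
            self.hits += 1
        else:
            self.misses += 1                     # fetch from the server
            self.blocks[block] = None
            if len(self.blocks) > self.capacity:
                self.blocks.popitem(last=False)  # evict LRU; never dirty
        if is_write:
            self.server_writes += 1              # write-through to the server

    def hit_ratio(self):
        total = self.hits + self.misses
        return self.hits / total if total else 0.0

# Replay a toy trace of (block id, is_write) records.
cache = WriteThroughFlashCache(capacity_blocks=2)
for blk, w in [(1, False), (2, False), (1, False), (3, False), (2, True)]:
    cache.access(blk, w)
```

Sweeping `capacity_blocks` over a real block trace is how one would reproduce the size-versus-persistence comparison: hit ratio responds strongly to capacity, while the write-through policy costs nothing beyond the server writes the workload already requires.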